(54) System and method of implementing disk ownership in networked storage



Europäisches Patentamt
European Patent Office
Office européen des brevets

(57) A system and method for disk ownership in a network storage system. Each disk has two ownership attributes set to show that a particular file server owns the disk. In a preferred embodiment, the first ownership attribute is the serial number of the file server, written to a specific location on each disk, and the second ownership attribute is a SCSI-3 persistent reservation. In a system utilizing this disk ownership method, multiple file servers can read data from a given disk, but only the file server that owns a particular disk can write data to the disk.
This capability can be utilized to generate redundant data pathways to a disk.

[0009] Each of the devices attached to the LAN includes an appropriate conventional network interface arrangement (not shown) for communicating over the LAN using a desired communication protocol such as the well-known Transmission Control Protocol/Internet Protocol (TCP/IP), User Datagram Protocol (UDP), Hypertext Transfer Protocol (HTTP), or Simple Network Management Protocol (SNMP).
[0010] One prior implementation of a storage system involves the use of switch zoning. Instead of the filer being directly connected to the fibre channel loop, the filer would be connected to a fibre channel switch, which would then be connected to a plurality of fibre channel loops. Switch zoning is accomplished within the fibre channel switches by manually associating ports of the switch. This association with, and among, the ports would allow a filer connected to a port associated with a port connected to a fibre channel loop containing disks to "see" the disks within that loop. That is, the disks are visible to that port. However, a disadvantage of the switch zoning methodology was that a filer could only see what was within its zone. A zone is defined as all devices that are connected to ports associated with the port to which the filer was connected. Another noted disadvantage of this switch zoning method is that if zoning needs to be modified, an interruption of service occurs as the switches must be taken off-line to modify zoning. Any device attached to one particular zone can only be owned by another device within that zone. It is possible to have multiple filers within a single zone; however, ownership issues then arise as to the disks within that zone.
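
The visibility rule of switch zoning described above can be pictured with a minimal sketch. It assumes, purely for illustration, that a zone is modelled as a set of switch port numbers and that each fibre channel loop hangs off one port; the function name `visible_disks` and the sample port and disk names are hypothetical and do not correspond to any real switch interface.

```python
def visible_disks(filer_port, zones, loop_disks):
    """Return the disks a filer can "see" under switch zoning.

    zones      -- list of sets of port numbers that were manually associated
    loop_disks -- mapping of port number -> disks on the loop attached to it
    (Both structures are illustrative assumptions.)
    """
    seen = set()
    for zone in zones:
        if filer_port in zone:
            # Every loop attached to a port in the filer's zone is visible.
            for port in zone:
                seen.update(loop_disks.get(port, ()))
    return seen


# Two zones: the filer on port 1 shares a zone with the loop on port 2 only.
zones = [{1, 2}, {3, 4}]
loop_disks = {2: ["disk-a", "disk-b"], 4: ["disk-c"]}
print(sorted(visible_disks(1, zones, loop_disks)))
```

In this model the filer on port 1 cannot see the disk behind port 4 until the zone sets themselves are changed, which mirrors the limitation noted above that a filer can only see what is within its zone.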

[0011] The need thus arises for a technique by which a filer can determine which disks it owns other than through a hardware mechanism and zoning contained within a switch. Such a disk ownership methodology for networked storage would permit easier scalability of networked storage solutions.

Summary of the Invention 



[0012] One aspect of the invention overcomes the disadvantages of the prior art by providing a system and method of implementing disk ownership by respective file servers without the need for direct physical connection or switch zoning within fibre channel (or other) switches. A two-part ownership identification system and method is defined. The first part of this ownership method is the writing of ownership information to a predetermined area of each disk. Within the system, this ownership information acts as the definitive ownership attribute. The second part of the ownership method is the setting of a SCSI-3 persistent reservation to allow only the disk owner to write to the disk. This use of a SCSI-3 persistent reservation allows other filers to read the ownership information from the disks. It should be noted that other forms of persistent reservations can be used in accordance with the invention. For example, if a SCSI level 4 command set is generated that includes persistent reservations operating like those contained within the SCSI-3 command, these new reservations are expressly contemplated to be used in accordance with the invention.

[0013] By utilizing this ownership system and method, any number of file servers connected to a switching network can read from, but not write to, all of the disks connected to the switching network. In general, this novel ownership system and method enables any number of file servers to be connected to one or more switches organized as a switching fabric with each file server being able to read data from all of the disks connected to the switching fabric. Only the file server that presently owns a particular disk can write to that disk.

BRIEF DESCRIPTION OF THE DRAWINGS

[0014] The above and further advantages of the invention may be better understood by referring to the following description in conjunction with the accompanying drawings in which like reference numerals indicate identical or functionally similar elements:

Fig. 1, already described, is a schematic block diagram of a network environment showing the prior art of a filer directly connected to a fibre channel loop;

Fig. 2 is a schematic block diagram of a network environment including various network devices including exemplary file servers and associated volumes;

Fig. 3 is a schematic block diagram of an exemplary storage appliance in accordance with Fig. 2;

Fig. 4 is a schematic block diagram of a storage operating system for use with the exemplary file server of Fig. 3 according to an embodiment of this invention;

Fig. 5 is a block diagram of an ownership table maintained by the ownership layer of the storage operating system of Fig. 4 in accordance with an embodiment of this invention; and

Fig. 6 is a flow chart detailing the steps performed by the storage operating system upon boot up to obtain ownership information of all disks connected to fibre channel switches connected to the individual filer.
purpose computer's microprocessor identification number, the file server's media access control (MAC) address, etc.

[0023] In the illustrative embodiment, the memory 304 may have storage locations that are addressable by the processor for storing software program code or data structures associated with the present invention. The processor and adapters may, in turn, comprise processing elements and/or logic circuitry configured to execute the software code and manipulate the data structures. The storage operating system 312, portions of which are typically resident in memory and executed by the processing elements, functionally organizes a file server by, inter alia, invoking storage operations in support of a file service implemented by the file server. It will be apparent to those skilled in the art that other processing and memory implementations, including various computer readable media, may be used for storing and executing program instructions pertaining to the inventive technique described herein.

[0024] The network adapter 306 comprises the mechanical, electrical and signaling circuitry needed to connect the file server to a client over the computer network, which, as described generally above, can comprise a point-to-point connection or a shared medium such as a LAN. A client can be a general-purpose computer configured to execute applications including file system protocols, such as the Network File System (NFS) or the Common Internet File System (CIFS) protocol. Moreover, the client can interact with the file server in accordance with the client/server model of information delivery. The storage adapter cooperates with the storage operating system 312 executing in the file server to access information requested by the client. The information may be stored in a number of storage volumes (Volume 0 and Volume 1), each constructed from an array of physical disks that are organized as RAID groups (RAID GROUPS 1, 2 and 3). The RAID groups include independent physical disks, including those storing striped data and those storing separate parity data. In accordance with a preferred embodiment, RAID 4 is used. However, other configurations (e.g., RAID 5) are also contemplated.
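
As a rough illustration of why storing separate parity data lets a RAID group survive the loss of one data disk, the following sketch computes byte-wise XOR parity over the blocks of one stripe, which is how RAID 4 parity is conventionally calculated. The function names and the three-block stripe are assumptions made for illustration; the sketch does not represent the RAID layer of the storage operating system described here.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of a sequence of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_block(data_blocks):
    """Parity stored on the dedicated parity disk for one stripe."""
    return xor_blocks(data_blocks)


def rebuild_block(surviving_blocks, parity):
    """Recover the one missing data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# One stripe spread over three hypothetical data disks.
stripe = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = parity_block(stripe)

# Pretend the second data disk failed and rebuild its block.
recovered = rebuild_block([stripe[0], stripe[2]], parity)
assert recovered == stripe[1]
```

Because XOR is its own inverse, combining the parity block with the surviving data blocks yields the missing block. RAID 5 uses the same calculation but rotates the parity across the disks instead of keeping it on a separate one.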

[0025] The storage adapter 308 includes input/output interface circuitry that couples to the disks over an I/O interconnect arrangement such as a conventional high-speed/high-performance fibre channel serial link topology. The information is retrieved by the storage adapter, and if necessary, processed by the processor (or the adapter itself) prior to being forwarded over the system bus to the network adapter, where the information is formatted into a packet and returned to the client.
[0026] To facilitate access to the disks, the storage operating system implements a file system that logically organizes the information as a hierarchical structure of directories and files on the disks. Each on-disk file may be implemented as a set of disk blocks configured to store information such as text, whereas the directory may be implemented as a specially formatted file in which other files and directories are stored. In the illustrative embodiment described herein, the storage operating system associated with each volume is preferably the NetApp® Data ONTAP storage operating system available from Network Appliance Inc. of Sunnyvale, California that implements a Write Anywhere File Layout (WAFL) file system. The preferred storage operating system for the exemplary file server is now described briefly. However, it is expressly contemplated that the principles of this invention can be implemented using a variety of alternate storage operating system architectures.
[0027] The host adapter 316, which is connected to the storage adapter of the file server, provides the file server with a unique world wide name, described further below.

C. Storage Operating System and Disk Ownership 

[0028] As shown in Fig. 4, the storage operating system 312 comprises a series of software layers including a media access layer 402 of network drivers (e.g., an Ethernet driver). The storage operating system further includes network protocol layers such as the Internet Protocol (IP) layer 404 and its Transmission Control Protocol (TCP) layer 406 and a User Datagram Protocol (UDP) layer 408. A file system protocol layer provides multi-protocol data access and, to that end, includes support for the CIFS protocol 410, the Network File System (NFS) protocol 412 and the Hypertext Transfer Protocol (HTTP) protocol 414.

[0029] In addition, the storage operating system 312 includes a disk storage layer 416 that implements a disk storage protocol such as a RAID protocol, and a disk driver layer 418 that implements a disk access protocol such as a Small Computer System Interface (SCSI) protocol. Included within the disk storage layer 416 is a disk ownership layer 420, which manages the ownership of the disks to their related volumes. Notably, the disk ownership layer includes program instructions for writing the proper ownership information to sector S and to the SCSI reservation tags.
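
The two ownership attributes that the disk ownership layer writes can be pictured with a minimal in-memory sketch. The `Disk` class, its `sector_s_owner` and `reservation_holder` fields, and the `claim`, `read` and `write` functions are hypothetical names chosen for illustration. No real SCSI commands are issued, and the sketch simply assumes the behaviour stated in this description: any file server may read a disk, but only the owner holding the reservation may write to it.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Disk:
    """Hypothetical stand-in for one disk on the switching fabric."""
    sector_s_owner: Optional[str] = None      # first attribute: owner's serial number in sector S
    reservation_holder: Optional[str] = None  # second attribute: persistent reservation tag
    blocks: dict = field(default_factory=dict)


def claim(disk: Disk, serial: str) -> bool:
    """Set both ownership attributes unless another server already owns the disk."""
    if disk.sector_s_owner not in (None, serial):
        return False
    disk.sector_s_owner = serial
    disk.reservation_holder = serial
    return True


def read(disk: Disk, serial: str, block: int) -> Optional[bytes]:
    """Reads are permitted for every file server, owner or not."""
    return disk.blocks.get(block)


def write(disk: Disk, serial: str, block: int, data: bytes) -> None:
    """Writes are permitted only for the reservation holder."""
    if disk.reservation_holder != serial:
        raise PermissionError(f"{serial} does not hold the reservation")
    disk.blocks[block] = data


disk = Disk()
claim(disk, "FILER-A")                 # FILER-A becomes the owner
write(disk, "FILER-A", 0, b"payload")  # the owner may write
print(read(disk, "FILER-B", 0))        # a non-owner may still read
```

Keeping the on-disk serial number and the reservation as separate fields mirrors the split described above: the first records who owns the disk in a place any file server can read, while the second is what actually blocks writes from non-owners.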

[0030] As used herein, the term "storage operating system" generally refers to the computer-executable code operable on a storage system that implements file system semantics (such as the above-referenced WAFL) and manages data access. In this sense, ONTAP software is an example of such a storage operating system implemented as a microkernel. The storage operating system can also be implemented as an application program operating over a general-purpose operating system, such as UNIX® or Windows NT®, or as a general-purpose operating system with configurable functionality, which is configured for storage applications as described herein.

[0031] Bridging the disk software layers with the network and file system protocol layers is a file system layer 424 of the storage operating system. Generally, the
invention can be implemented as a computer program, the present invention encompasses any suitable carrier medium carrying the computer program for input to and execution by a computer. The carrier medium can comprise a transient carrier medium such as a signal, e.g. an electrical, optical, microwave, magnetic, electromagnetic or acoustic signal, or a storage medium, e.g. a floppy disk, hard disk, optical disk, magnetic tape, or solid state memory device. Additionally, it is expressly contemplated that other devices connected to a network can have ownership of a disk in a network environment. Accordingly, this description is meant to be taken only by way of example and not to otherwise limit the scope of this invention.



Claims 

1. A method for a network device to claim ownership
of a disk in a network storage system comprising
the steps of:

setting a first ownership attribute on the disk to
a state of ownership by the network device; and
setting a second ownership attribute on the disk
to a state of ownership by the network device.

2. The method of claim 1, wherein one of the first
ownership attribute and the second ownership attribute
further comprises a small computer system interface
level 3 persistent reservation tag.

3. The method of claim 1, wherein one of the first
ownership attribute and the second ownership attribute
further comprises ownership information written on
a predetermined area of the disk.

4. The method of claim 3, wherein the ownership
information further comprises a serial number of the
network device.

5. The method of any preceding claim, wherein the
network device comprises a file server.

6. A method of claiming ownership of a disk by a
network device in a network storage system comprising
the steps of:

writing ownership information to a predetermined
area of the disk; and
setting a small computer system interface level
3 persistent reservation tag to a state of
network device ownership.

7. The method of claim 6, wherein the ownership
information further comprises a serial number of a
network device.



8. The method of claim 5 or claim 7, wherein the net- 
work device comprises a file server. 

9. A network storage system comprising:

a plurality of network devices;
one or more switches, each network device
connected to at least one of the one or more
switches; and

a plurality of disks having a first ownership
attribute and a second ownership attribute, each
disk connected to at least one of the plurality of
switches.

10. The network storage system of claim 9, wherein the
first ownership attribute further comprises ownership
information written on a predetermined area of
the disk.

11. The network storage system of claim 9 or claim 10,
wherein the second ownership attribute further
comprises a small computer system interface level
3 persistent reservation tag.

12. The networked storage system of claim 11, wherein
each disk that is owned by the network device has
the small computer system interface level 3 persistent
reservation set such that only the network device
may write to the disk.

13. The network storage system of claim 10, wherein
the ownership information further comprises a
serial number of the network device that owns that
particular disk.

14. The network storage system of any one of claims 9
to 13, wherein each of the plurality of file servers
can read data from each of the plurality of disks.

15. The network storage system of any one of claims 9 
to 14, wherein only a network device that owns one 
of the plurality of disks can write data to the one disk. 

16. The network storage system of any one of claims 9
to 15, wherein the network devices comprise file 
servers. 

17. A network storage system comprising: 

one or more switches; 
a plurality of disks; and 

a plurality of network devices, each of the net- 
work devices including means for claiming 
ownership of one of the plurality of disks in the 
network storage system. 

18. The network storage system of claim 17, wherein 
the means for claiming ownership further compris- 